Quantitative association between gene expression and blood cell production of individual hematopoietic stem cells in mice

Individual hematopoietic stem cells (HSCs) produce different amounts of blood cells upon transplantation. Taking advantage of the intercellular variation, we developed an experimental and bioinformatic approach to evaluating the quantitative association between gene expression and blood cell production across individual HSCs. We found that most genes associated with blood production exhibit the association only at some levels of blood production. By mapping gene expression with blood production, we identified four distinct patterns of their quantitative association. Some genes consistently correlate with blood production over a range of levels or across all levels, and these genes are found to regulate lymphoid but not myeloid production. Other genes exhibit one or more clear peaks of association. Genes with overlapping peaks are found to be coexpressed in other tissues and share similar molecular functions and regulatory motifs. By dissecting intercellular variations, our findings revealed four quantitative association patterns that reflect distinct dose-response molecular mechanisms modulating the blood cell production of HSCs.

Examples of previous studies that showed the relevant functions of the genes that we identified as associated with HSC self-renewal or granulocyte production.

Fig. S2. Comparing the self-renewal of HSCs derived from the same ancestor. (A) Number of detected clones in each primary recipient (orange triangle) and its corresponding secondary recipients (blue diamonds), aligned vertically. (B) Abundances of HSC clones from secondary (2°) recipients that share a common primary recipient. Each open circle represents an HSC clone. Clones from eight secondary recipients are plotted. Shown are pairwise comparisons among all possible mouse pairs for each clone. "r" denotes the Pearson correlation coefficient.
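The pairwise comparison described in panel (B) can be illustrated with a minimal sketch. It is not the authors' code: the function names, the mouse labels, and the toy abundance values are hypothetical, and it assumes that each mouse's clone abundances are listed in the same clone order.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def pairwise_clone_correlations(abundance_by_mouse):
    """Compute r for every possible pair of mice.

    abundance_by_mouse maps a mouse label to a list of clone abundances,
    with clones in the same order for every mouse (an assumption here).
    """
    mice = sorted(abundance_by_mouse)
    return {
        (m1, m2): pearson_r(abundance_by_mouse[m1], abundance_by_mouse[m2])
        for m1, m2 in combinations(mice, 2)
    }


# Hypothetical toy data: three secondary recipients, four shared clones.
toy = {
    "2deg_A": [0.1, 0.4, 0.2, 0.3],
    "2deg_B": [0.2, 0.8, 0.4, 0.6],
    "2deg_C": [0.3, 0.1, 0.4, 0.2],
}
result = pairwise_clone_correlations(toy)
for pair, r in result.items():
    print(pair, round(r, 3))
```

With eight secondary recipients, as in the figure, this enumeration would yield 28 mouse pairs.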

Fig. S3. One-to-multiple serial transplantation experiment using purified HSCs. Pearson correlation coefficient for the clonal abundance of granulocytes (A) or B cells (B) between one primary recipient and one secondary recipient (1° vs. 2°) and between two secondary recipients (2° vs. 2°). Each marker represents a comparison between a pair of mice. Experiments were performed similarly to those shown in Fig. 1, except that FACS-purified HSCs were used as donor cells for the secondary transplantation. * P < 0.05, *** P < 0.001.
Fig. S5. Fewer genes were identified when data from fewer cells were used in the analyses. Shown are the numbers of genes significantly associated with HSC lineage output at false positive scores (FPS) less than 0.05, using different numbers of cells and the algorithm outlined in Fig. 3A.

Fig. S7. Examples of quantitative comparison between gene expression and lineage output of individual HSCs. Shown are four example genes as in Fig. 5B. Each color represents data from one mouse. In the left panels, each dot represents data from one cell. Some dots overlap, particularly those with expression levels at 0. In the right panels, the lineage output values on the x-axis are split into ten equal bins, and the violin plot shows the distribution of the gene expression levels within each bin.
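The binning used for the right panels can be sketched as follows. This is an illustration only: it assumes that "ten equal bins" means ten equal-width intervals spanning the observed range of lineage output, which the caption does not state explicitly, and the function name and toy values are hypothetical.

```python
def bin_expression_by_output(lineage_output, expression, n_bins=10):
    """Group per-cell expression values into equal-width lineage output bins.

    Assumes equal-width bins between the minimum and maximum observed
    lineage output. Returns a list of n_bins lists of expression values,
    one per bin, suitable as input for one violin per bin.
    """
    if len(lineage_output) != len(expression):
        raise ValueError("inputs must have the same length")
    lo, hi = min(lineage_output), max(lineage_output)
    if hi == lo:
        raise ValueError("lineage output has no range to bin")
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for out, expr in zip(lineage_output, expression):
        # Clamp so that the maximum value falls into the last bin.
        idx = min(int((out - lo) / width), n_bins - 1)
        bins[idx].append(expr)
    return bins


# Hypothetical toy data: five cells with lineage output between 0 and 1.
outputs = [0.0, 0.05, 0.55, 0.95, 1.0]
levels = [0, 1, 2, 3, 4]
binned = bin_expression_by_output(outputs, levels)
for i, values in enumerate(binned):
    print(i, values)
```

Bins without cells are returned as empty lists, so a plotting step would need to skip them or leave the corresponding position blank.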

Fig. S8. Classification of quantitative association patterns. The Euclidean distance distribution and cutoff threshold for classifying genes that are significantly associated with granulocyte production (A), HSC self-renewal (B), total granulocyte (Gr) and B cell production (C), and B cell / granulocyte lineage bias (D). See Fig. 5A for B cell production and more details.
Fig. S9. Quantitative association patterns between lineage output and gene expression across individual HSCs. Shown are all genes identified as significantly associated with HSC self-renewal (A), granulocyte production (B), B cell production (C), total granulocyte and B cell production (D), and B cell / granulocyte lineage bias (E). "†" in (A) denotes genes highlighted in Fig. 6B. "#" in (C) denotes genes highlighted in Fig. 6A.

Fig. S11. Cell cycle distribution of HSCs with high or low levels of self-renewal. (A) Pie chart and (B) bar plot showing the cell cycle distribution of HSCs that are on the left or right side of the yellow highlighted line in Fig. 6B, corresponding to HSCs with low or high levels of self-renewal, respectively.

Fig. S12. Transcription factors with binding motifs most similar to those in Fig. 7A. Shown are the three best-matched transcription factors in mice.

Fig. S13. Analyzing human homologs of genes with overlapping association peaks highlighted in yellow in Fig. 6B. (A) Common motif analysis was performed similarly to that in Fig. 7A. The human homologs of the background genes in Fig. 7A were used as background. (B) Pairwise comparison of the transcription levels as measured by mRNA microarrays across various human tissues. Analysis was performed similarly to that in Fig. 7C (http://hegemon.ucsd.edu/Tools/explore.php?key=global, Dataset "Human U133 Plus 2.0"). One of the seven highlighted genes (2810417H13Rik) was not found in the dataset. Each dot represents data from one microarray analysis. The position of the dot shows the normalized expression levels of the corresponding genes.